WaterCAD 2024 Help

Best Practices and Tips

Minimize the solution space

In optimization problems one is looking for an optimal or near-optimal solution from a set of possible input values. For problems of low complexity, the total number of possible permutations of valid input can be completely enumerated. Consider a steady state problem where 2 pumps can each be either on or off. If we represent the on state with the number 1 and the off state with the number 0, then the notation (1, 1) indicates that both pumps are on. One trial solution in such a problem is (1, 0). Clearly there are 4 possible permutations in this problem, the other three being (0, 1), (0, 0) and (1, 1). The set of all possible permutations of input is known as the solution space. Even if a single permutation of input (a trial solution) took an hour to evaluate, the entire solution space could be enumerated in 4 hours, making it practical to do so provided that the optimal solution is not required sooner than that. The solution space for this 2 pump problem is of size 2^2, or 4. The solution space for an equivalent 10 pump problem is 2^10, or 1,024. What is not immediately obvious, however, is how quickly the size of the solution space in optimization problems can grow to mind-boggling sizes.
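
The following short Python sketch is purely illustrative and is not part of WaterCAD or Darwin Scheduler; it simply enumerates the 2 pump on/off solution space and shows how the count grows as 2 raised to the number of pumps:

    from itertools import product

    # Enumerate every on/off permutation for the 2 pump example: 2**2 = 4 trial solutions
    two_pump_space = list(product((1, 0), repeat=2))
    print(two_pump_space)       # [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(len(two_pump_space))  # 4

    # For an equivalent 10 pump problem the solution space grows to 2**10 permutations
    print(2 ** 10)              # 1024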

For example, let us consider a pump schedule optimization problem with 10 pumps and an EPS of 24 hours duration with a hydraulic time step of 1 hour. In addition, let's assume the pumps are optimized as variable speed with possible settings of 0.8, 0.85, 0.9, 0.95 and 1.0. Assuming all pumps are optimized for the entire duration of the EPS (time 0 to time 24 hours), there are 10 x 24 = 240 speed decisions to be made for each trial solution, and each of those decisions can take one of 5 different values. Even for this modest sounding optimization problem, the size of the solution space is thus 5^240, or 5.65 x 10^167! Now let's assume that we can easily write off 99.99% of solutions as impractical or plain nonsense; that still leaves 5.65 x 10^163 solutions to investigate. If we could evaluate one million trial solutions every second, it would still take 1.79 x 10^150 years to evaluate them all! One public estimate of the number of atoms in the entire observable universe is 10^80, which is virtually zero when compared to 1.79 x 10^150, so quite clearly we are talking about numbers so large that they are difficult, if not impossible, to comprehend. A small increase in the complexity of the problem magnifies the total number of possible solutions greatly; conversely, a small decrease in problem complexity reduces it greatly. It is therefore a very good idea to consider the following when setting up a pump scheduling optimization problem.

  • Number of pumps being optimized; keep the number of pumps being considered to the minimum possible, to the point of optimizing different pump stations independently if that is hydraulically reasonable in the system being optimized.
  • Number of pump speed choices; keep the number of possible speed choices (including the off setting) to the minimum possible. Consider optimizing with coarse speed settings to find a rough solution, then follow up with a second optimization that uses refined speed settings (finer increments over a narrower range).
  • Schedule control interval (EPS hydraulic time step); consider using a coarse hydraulic time step such as 2 or even 3 hours, at least for initial optimization runs, as this greatly reduces the size of the solution space, especially if multiple pumps are being optimized.
  • Schedule duration; consider optimizing the shortest EPS duration possible. A 24 hour duration seems to be the most reasonable choice in terms of being able to produce a repeatable schedule, whilst keeping the solution space as small as possible.
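
The arithmetic behind the 10 pump example above can be written out in a few lines (a purely illustrative Python sketch; the one-million-trials-per-second rate and the 0.01% of solutions retained are the same assumptions used in the text above):

    pumps = 10                 # pumps being optimized
    duration_hr = 24           # EPS duration
    time_step_hr = 1           # hydraulic time step (control interval)
    speed_choices = 5          # 0.80, 0.85, 0.90, 0.95, 1.00

    decisions = pumps * (duration_hr // time_step_hr)   # 240 speed decisions per trial
    space = speed_choices ** decisions                   # 5**240, about 5.65 x 10^167

    remaining = space * 1e-4                              # keep only 0.01% of the solutions
    seconds = remaining / 1e6                             # one million trial evaluations per second
    years = seconds / (3600 * 24 * 365)                   # about 1.79 x 10^150 years
    print(f"solution space: {space:.2e}")
    print(f"years to evaluate the remaining 0.01%: {years:.2e}")

    # With a coarser 3 hour time step and only 3 speed choices the space shrinks
    # to 3 ** (10 * 8), roughly 1.5 x 10^38 -- still huge, but vastly smaller.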

The following table shows the size of the solution space for different numbers of pumps being optimized (Pump Count), numbers of speed choices per pump (Speed Choices) and EPS time steps. It is evident that increasing the number of pumps being optimized, increasing the number of speed choices, or refining the EPS time step each has an exponential effect on the size of the solution space, and thus inevitably reduces the effectiveness of the optimization. When running an optimization it is wise to start out conservatively and only increase the optimization complexity to refine optimization results.

Table 14-1: The effect on optimization solution space of number of pumps to optimize, number of speed choices and EPS time step (control interval).
Pump Count   Speed Choices   Solution Space        Solution Space        Solution Space
                             (1 hour time step)    (2 hour time step)    (3 hour time step)
1            6               4.7E+18               2.2E+09               1.7E+06
1            12              7.9E+25               8.9E+12               4.3E+08
1            18              1.3E+30               1.2E+15               1.1E+10
2            6               2.2E+37               4.7E+18               2.8E+12
2            12              6.3E+51               7.9E+25               1.8E+17
2            18              1.8E+60               1.3E+30               1.2E+20
3            6               1.1E+56               1.0E+28               4.7E+18
3            12              5.0E+77               7.1E+38               7.9E+25
3            18              2.4E+90               1.5E+45               1.3E+30
4            6               5.0E+74               2.2E+37               8.0E+24
4            12              4.0E+103              6.3E+51               3.4E+34
4            18              3.2E+120              1.8E+60               1.5E+40
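
The values in Table 14-1 follow from the relationship solution space = (speed choices)^(pump count x number of time steps). A purely illustrative sketch to reproduce them, assuming the 24 hour schedule duration discussed above:

    # Reproduce Table 14-1: solution space = choices ** (pumps * time steps)
    duration_hr = 24
    for pumps in (1, 2, 3, 4):
        for choices in (6, 12, 18):
            row = [f"{choices ** (pumps * (duration_hr // step)):.1e}"
                   for step in (1, 2, 3)]
            print(pumps, choices, *row)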

Minimize the trial solution time

In our discussion of minimizing the solution space we considered the time required to evaluate just the remaining 0.01% of trial solutions by assuming that we could evaluate one million trials per second. Clearly this figure is unrealistic even on today's fastest computers and for the most trivial of hydraulic models, so the time the model takes to solve is a significant contributor to the total time required to run Darwin Scheduler. Any improvement that can be made to the run time of the base EPS simulation translates directly into a shorter Darwin Scheduler optimization run time. Methods to reduce run time that should be considered include:

  1. Model size: The more hydraulic elements in a model, the larger the solution matrix that needs to be solved and the longer the run time of the solution. It is unrealistic to expect Darwin Scheduler to optimize a 50,000 pipe model in a few minutes if a single EPS run for such a model itself takes a few minutes. Strongly consider using a version or copy of the subject model that is customized for the purpose of pumping optimization. Such a model might be smaller because elements or zones not required for the energy optimization have been excluded, or because hydraulic elements not required for the energy optimization have been removed by skeletonization. In fact a skeletonized model is highly recommended for pump schedule optimization, particularly if the model is skeletonized whilst maintaining hydraulic equivalence, such as can be performed using Skelebrator Skeletonizer. The benefit of the smaller model and quicker run time will greatly outweigh any potential or perceived side effect (if any at all) of the skeletonization process.
  2. Model complexity: The larger or more complex the model (e.g., complicated control regimes), the longer an EPS simulation will take to run, due to the need to simulate additional intermediate time steps (such as the times when control rules fire). Consider removing any redundant model complexity that is not required for a pump operation simulation.
  3. Model balance: Even a small model may take a long time to run if it is not well balanced. Examine the number of trials the model takes to solve at each time step; if it is consistently high (25-100+) then there may be time to be saved by improving this. A high number of trials may be indicative of a number of different issues, such as bad control valve settings or too-narrow control ranges.
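
To see why the EPS run time dominates the total optimization time, consider a rough wall-clock estimate (a purely illustrative sketch; the EPS run time, population size and generation count below are hypothetical example values, not Darwin Scheduler defaults):

    # Rough wall-clock budget: total time is about trials x single EPS run time
    eps_run_s = 2.0        # hypothetical time for one EPS run, in seconds
    population = 200       # hypothetical GA population size
    generations = 500      # see "Allow runs sufficient time to complete" below

    trials = population * generations        # about 100,000 trial solutions
    hours = trials * eps_run_s / 3600.0
    print(f"{trials:,} trials -> roughly {hours:.0f} hours of computation")

Halving the EPS run time, for example by skeletonizing the model as suggested above, roughly halves this estimate.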

Use a faster computer

These days most computers are reasonably fast; however, time is money, and a faster computer can save both. The Darwin Scheduler optimization process is computationally expensive, so a computer with a faster CPU will produce results sooner. Multi-core machines will also provide increased overall performance.

Carefully consider hydraulic constraints

If certain hydraulic constraints are required to be met, it is a good idea to consider these carefully and add only the constraints that are essential, as opposed to adding blanket constraints. Adding blanket constraints, especially for large models, is discouraged: blanket constraints are more likely to contain impossible-to-meet constraints (such as a pressure constraint on a junction on the suction side of a pump), they have a slight effect on performance (constraints have to be evaluated for every trial solution), and they increase Darwin Scheduler's output file size unnecessarily. For this reason Darwin Scheduler is designed to require the user to add constraints manually.

Ensure runs are set up properly

Even for a small, well balanced model, run times for Darwin Scheduler will be proportional to the time a single EPS takes to run multiplied by the number of trials required to find a near-optimal solution. It is therefore a good idea to ensure that a run is progressing in an acceptable fashion in its early stages (generation 50 - 200) before leaving it to run for the remainder of the optimization. Be sure to leverage Darwin Scheduler's resume feature, which allows one to stop a run, review the results (and even export the solution) and then continue the run, so long as no other runs have been started and no other hydraulic computation has been performed.

Plan to use the tool efficiently

One good thing about computers is that they don't need to sleep like people do. When working with larger models that may require a longer run time, consider running shorter debugging optimization runs during the day, making any necessary adjustments, and then running the "real" runs during a lunch break or perhaps even overnight.

Allow runs sufficient time to complete

One characteristic of genetic algorithm optimization is the need for heuristic stopping criteria. In Darwin Scheduler several different criteria are available, depending on the type of genetic algorithm selected. There is, however, no definitive way to determine when a run should be stopped; running just one more generation may yield a better solution than previously found. Generally speaking, optimization runs should be allowed to run for at least 500 generations (preferably longer), which, depending on population size, can mean on the order of 100,000+ trials. Please be patient!

Plan to do multiple runs

The nature of genetic algorithm optimization is such that there is a random component to the algorithm. The algorithm is driven by computationally efficient search processes; however, at its core random numbers are used to drive processes such as mutation. Therefore, two optimization runs that are otherwise identical except for one minor change (e.g., a larger population size or a different random seed) will in all likelihood produce different optimized solutions. This is more likely to be the case the larger the solution space of the problem. It is therefore a good idea to run multiple optimization runs, changing nothing other than one or more genetic algorithm parameters (or simply the random seed), to ensure that the best optimized solution is really the best that can be achieved. One beneficial characteristic of genetic algorithm optimization is its ability to find solutions that may be very close in terms of hydraulic performance, yet may themselves be quite different. Engineers are therefore able to discriminate between optimized solutions based on other, perhaps non-hydraulic, criteria.

You can also leverage an existing solution (such as the representative scenario, assuming it meets constraints) to create a Baseline Seed for Scheduler to use. Export the results of a Scheduler run to a new scenario, then calculate an EPS run for the new scenario. Use this scenario as Scheduler's representative scenario to seed a new Scheduler run.